Skip to content

Conversation

@Maegereg
Copy link
Contributor

@Maegereg Maegereg commented Nov 1, 2025

This has function does not match the behavior of __eq__ - non exact arbs with the same midpoint and radius will hash to the same value even though they're not mathematically equal. I considered having non-exact arbs return random hash values, but that feels wrong (though I do note that the python documentation doesn't explicitly forbid it), and it wouldn't even always work (hash collisions do exist). I also considered hashing the object id for non-exact arbs, but non-exact arbs aren't equal to themselves, so that doesn't quite work correctly either. Overall, this seemed like the simplest solution, even if it does have the possibility to create poor performance if you try to use non-exact arbs in a hashing data structure.

@oscarbenjamin
Copy link
Collaborator

Given that arbs don't compare equal to themselves it probably doesn't make sense to use them in a hashing data structure. The behaviour of list and dict is that they check for (truncated) hash equality first, then identity, then equality. This means that with this PR an inexact arb behaves like a nan:

In [1]: import flint

In [2]: a1 = flint.arb(10,1)

In [3]: a2 = flint.arb(10,1)

In [4]: a1 in {a1}
Out[4]: True

In [5]: a1 in {a2}
Out[5]: False

The question is whether this is actually useful. Before you mentioned caching and in that sense it can be useful that the lookup sometimes works even if it sometimes fails.

What is the actual use case here though? Is it just caching?

I wonder if it would make more sense to handle that use case differently. There is a question about what a == b should do for arbs and only one possible behaviour can be chosen. Clearly though there is also a need to have a way of comparing arbs for structural equality (same midpoint and radius). Maybe it would be better to add methods for that explicitly like:

  • An equal_repr method or something for checking that things have the same type and representation.
  • A hash_key method that gives a hashable object whose equality comparison works structurally (for an arb this would be the tuple (mid.man_exp(), rad.man_exp())).

Some tests already use an _equal_repr method because this is needed when you want to assert the type as well as the value which assert a == b does not. Maybe this should be made into a public method that is provided for all types.

I think it also makes sense that all types could have something like a sort_key method that enables sorting into an arbitrary but always defined order even for types where a < b is mathematically ambiguous or undefined such as matrices, finite fields, polynomials.

Some types such as matrices and polynomials are mutable and so should probably not have __hash__ but could perhaps have a a.hash() method for computing a hash any way in situations where we want them to be treated as immutable e.g. for a cache lookup. That could perhaps just be provided by a hash_key method though.

Ideally Python would have some separate idea of equality like a === b which would be the kind of equality that is used for hash tables like sets and dicts and that uses a different method from a == b.

@Maegereg
Copy link
Contributor Author

Maegereg commented Nov 3, 2025

Given that arbs don't compare equal to themselves it probably doesn't make sense to use them in a hashing data structure. The behaviour of list and dict is that they check for (truncated) hash equality first, then identity, then equality. This means that with this PR an inexact arb behaves like a nan:

Oh, that is rough. I agree that isn't desirable.

Ideally Python would have some separate idea of equality like a === b which would be the kind of equality that is used for hash tables like sets and dicts and that uses a different method from a == b.

If I were designing arb from scratch, I suspect I'd have __eq__ be structural equality and have mathematical equality be another method. But at this point that would be quite the breaking change, not to mention a bit counterintuitive.

What is the actual use case here though? Is it just caching?

For my purposes, yes. We have a method that takes (exact) arbs as parameters, and we want to stick a @lru_cache on it for performance reasons. I can speculate that someone might want to hash arbs in other contexts, but I don't have a concrete use case.

I don't think a .hash_key() method works for that. Maybe you could do something similar though - I'm imagining a wrapper or subclass that has a hash method defined, and equality defined as structural equality. You could then make it very easy to convert back and forth between the two types. Not perfect, because you would have to convert back and forth, but it gives you reasonable behavior with hashing types.

@oscarbenjamin
Copy link
Collaborator

I don't think you can really win either way with the question of structural vs mathematical equality. There will always be people who want or expect one rather than the other.

The hash method here is consistent with __eq__ and works for exact arbs so I don't object to it. I just wanted to point out that the behaviour with inexact arbs may be surprising. There isn't really a way around that though either because you get the same thing without hashing:

In [2]: a1 = arb(2, 1)

In [3]: a2 = arb(2, 1)

In [4]: a1 in [a1]
Out[4]: True

In [5]: a2 in [a1]
Out[5]: False

@Maegereg
Copy link
Contributor Author

Maegereg commented Nov 4, 2025

Fair point. One other thought: we could keep this hash behavior, but have it raise an error for inexact arbs.

If the three options are:

  1. Sensible hash for exact arbs, error for inexact arbs.
  2. Sensible hash for exact arbs, nonsensical hash for inexact arbs (current version).
  3. Wrapper class that compares and hashes sensibly.

Which do you prefer?

I might lean slightly towards 1, since it's still relatively simple, but doesn't have weird edge cases. But I could be persuaded otherwise.

@oscarbenjamin
Copy link
Collaborator

Option 1 seems reasonable. It could be changed to not raises an error later if desired.

@Maegereg
Copy link
Contributor Author

Maegereg commented Nov 4, 2025

Done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants